Estimating a Logistic Discrimination Functions When One of the Training Samples Is Subject to Misclassification: A Maximum Likelihood Approach

Authors

  • Nico Nagelkerke
  • Vaclav Fidler
  • Delmiro Fernandez-Reyes
Abstract

The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are incorrectly classified/labeled as healthy controls. We show that this leads to a zero-inflated binomial model with a defective logistic regression or discrimination function, whose parameters can be estimated using standard statistical methods such as maximum likelihood. These parameters can be used to estimate the probability of true group membership among those classified, possibly erroneously, as controls. Two examples are analyzed and discussed. A simulation study explores the properties of the maximum likelihood parameter estimates and of the estimates of the number of mislabeled observations.
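The model outlined in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption, not taken from the paper: a single covariate `x`, the parameter names `beta0`, `beta1`, and `alpha` (the probability that a true case is mislabeled as a control), and a simple prospective sampling scheme. Under these assumptions, the probability of being labeled a case is a logistic curve scaled by `1 - alpha` (one reading of "defective"), and Bayes' rule gives the probability that a labeled control is in fact a case.

```python
import math


def expit(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def prob_labeled_case(x, beta0, beta1, alpha):
    """P(labeled case | x): logistic curve scaled by (1 - alpha).

    alpha is the assumed probability that a true case is mislabeled
    as a control (hypothetical parameter name, 0 <= alpha < 1).
    """
    return (1.0 - alpha) * expit(beta0 + beta1 * x)


def log_likelihood(xs, labels, beta0, beta1, alpha):
    """Log-likelihood of observed labels (1 = labeled case, 0 = labeled control)."""
    total = 0.0
    for x, y in zip(xs, labels):
        q = prob_labeled_case(x, beta0, beta1, alpha)
        total += math.log(q) if y == 1 else math.log(1.0 - q)
    return total


def prob_true_case_given_control(x, beta0, beta1, alpha):
    """P(true case | labeled control, x) by Bayes' rule."""
    p = expit(beta0 + beta1 * x)
    return alpha * p / (1.0 - (1.0 - alpha) * p)
```

Maximum likelihood estimates would be obtained by maximizing `log_likelihood` over `beta0`, `beta1`, and `alpha` with any general-purpose optimizer. Summing `prob_true_case_given_control` over the labeled controls would then give one possible estimate of the number of mislabeled observations.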

Similar resources

Evaluation of estimation methods for parameters of the probability functions in tree diameter distribution modeling

One of the most commonly used statistical models for characterizing the variation of tree diameter at breast height is the Weibull distribution. The usual approach to estimating the parameters of a statistical model is maximum likelihood estimation (the likelihood method). This usually relies on iterative algorithms such as Newton-Raphson. However, the efficiency of the likelihood method is not...

Discrimination of time series based on kernel method

Classical discrimination methods, such as linear and quadratic discrimination, are not very efficient for non-Gaussian or nonlinear time series data. Nonparametric kernel discrimination, in which kernel estimators of the likelihood functions are used instead of their true values, has been shown to perform well. The misclassification rate of kernel discrimination is usually less than ...

Inference for the Type-II Generalized Logistic Distribution with Progressive Hybrid Censoring

This article presents the analysis of Type-II hybrid progressively censored data when the lifetime distributions of the items follow a Type-II generalized logistic distribution. Maximum likelihood estimators (MLEs) are investigated for estimating the location and scale parameters. It is observed that the MLEs cannot be obtained in explicit form. We provide the approximate maximum likelihood...

Binary Regression With a Misclassified Response Variable in Diabetes Data

Objectives: Categorical data analysis is very important in statistics and the medical sciences. When the binary response variable is misclassified, fitting the model yields biased estimates of the adjusted odds ratios. The present study aimed to use a method to detect and correct misclassification error in the response variable of Type 2 Diabetes Mellitus (T2DM), applying binary ...

Estimating the Time of a Step Change in Gamma Regression Profiles Using MLE Approach

Sometimes the quality of a process or product is described by a functional relationship, referred to as a profile, between a response variable and one or more explanatory variables. Most research in this area assumes that the response variable is normally distributed; in certain applications, however, the normality assumption is violated. In these cases the Generalized Linear Mod...

Journal title:

Volume 10, Issue 

Pages  -

Publication date: 2015